


MoCo alone underperforms because its resulting networks still need fine-tuning with (pseudo) class labels before they perform well on the target task.

Neural Information Processing Systems

MoCo is effective for unsupervised pre-training, but its resulting networks need fine-tuning with (pseudo) class labels. Within GPU memory, 200,000+ instance features can easily be stored. We have added experiments on MSMT17 as suggested, and we will look into more theoretical analysis in future studies. Our self-paced strategy dynamically determines confident clusters and un-clustered instances.
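A minimal sketch of how such a split into confident clusters and un-clustered instances might be computed, assuming DBSCAN clustering over L2-normalized features; the function name `split_confident_clusters`, the `conf_threshold` parameter, and the centroid-compactness score are illustrative assumptions, not the paper's exact reliability criterion:

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import normalize


def split_confident_clusters(features, eps=0.5, min_samples=4, conf_threshold=0.6):
    """Cluster features; keep only compact ("confident") clusters.

    Returns (confident, unclustered):
      confident   -- dict mapping cluster id -> array of member indices
      unclustered -- indices DBSCAN marked as noise (label -1), treated
                     as independent instances
    """
    feats = normalize(features)  # L2-normalize rows so dot products are cosine similarities
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(feats)

    unclustered = np.where(labels == -1)[0]  # DBSCAN noise points -> un-clustered instances
    confident = {}
    for cid in set(labels) - {-1}:
        idx = np.where(labels == cid)[0]
        centroid = feats[idx].mean(axis=0)
        centroid /= np.linalg.norm(centroid)
        # Illustrative confidence score: mean cosine similarity to the cluster centroid.
        compactness = float((feats[idx] @ centroid).mean())
        if compactness >= conf_threshold:
            confident[cid] = idx
    return confident, unclustered
```

Re-running this split at each epoch lets the criterion tighten or relax as features improve, which is roughly what "dynamically determines" refers to; confident clusters can then supply pseudo labels while noise points are kept as independent instances.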